Parallel Solution of Sparse Linear Least Squares Problems on Distributed-Memory Multiprocessors

Authors

  • Chunguang Sun
Abstract

This paper studies the solution of large-scale sparse linear least squares problems on distributed-memory multiprocessors. The method of corrected semi-normal equations is considered. New block-oriented parallel algorithms are developed for solving the related sparse triangular systems. The arithmetic and communication complexities of the new algorithms applied to regular grid problems are analyzed. The proposed parallel sparse triangular solution algorithms, together with a block-oriented parallel sparse QR factorization algorithm, result in a highly efficient block-oriented approach to the parallel solution of sparse linear least squares problems on distributed-memory multiprocessors. Performance of the block-oriented approach is demonstrated empirically through an implementation on an IBM Scalable POWERparallel system SP2. The largest problem solved has over two million rows and more than a quarter million columns. The numerical factorization of this problem runs at over 3.7 gigaflops on an IBM SP2 machine with 128 processors.
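The method of corrected semi-normal equations named in the abstract can be illustrated on a tiny dense problem. The sketch below is not the paper's block-oriented parallel algorithm. It is a serial, pure-Python illustration under simplifying assumptions: dense lists stand in for sparse distributed data, the R factor comes from modified Gram-Schmidt, and all function names are hypothetical. It shows the general idea that only R is kept from the QR factorization, that R^T R x = A^T b is solved with two triangular solves, and that one refinement step is then applied using the residual.

```python
# Serial dense sketch of corrected semi-normal equations (CSNE).
# Assumption: A has full column rank; names below are illustrative only.

def qr_r_factor(A):
    """Return the upper triangular R of A = QR via modified Gram-Schmidt.

    Q is never stored, which is the point of the semi-normal equations.
    """
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]  # column copies
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        R[j][j] = sum(v * v for v in cols[j]) ** 0.5
        q = [v / R[j][j] for v in cols[j]]
        for k in range(j + 1, n):
            R[j][k] = sum(qi * vi for qi, vi in zip(q, cols[k]))
            cols[k] = [vi - R[j][k] * qi for qi, vi in zip(q, cols[k])]
    return R

def solve_lower_transposed(R, c):
    """Solve R^T y = c by forward substitution (R^T is lower triangular)."""
    n = len(R)
    y = [0.0] * n
    for i in range(n):
        s = c[i] - sum(R[k][i] * y[k] for k in range(i))
        y[i] = s / R[i][i]
    return y

def solve_upper(R, y):
    """Solve R x = y by back substitution."""
    n = len(R)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = y[i] - sum(R[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / R[i][i]
    return x

def semi_normal_solve(A, R, v):
    """Solve R^T R x = A^T v using two triangular solves."""
    m, n = len(A), len(A[0])
    c = [sum(A[i][j] * v[i] for i in range(m)) for j in range(n)]  # A^T v
    return solve_upper(R, solve_lower_transposed(R, c))

def csne(A, b):
    """Semi-normal equations followed by one correction step."""
    R = qr_r_factor(A)
    x = semi_normal_solve(A, R, b)
    # Residual r = b - A x, then solve for the correction dx.
    r = [bi - sum(aij * xj for aij, xj in zip(row, x)) for row, bi in zip(A, b)]
    dx = semi_normal_solve(A, R, r)
    return [xi + di for xi, di in zip(x, dx)]

# Usage: fit a line y = c0 + c1 * t through four points.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
b = [1.0, 3.0, 5.0, 7.0]
x = csne(A, b)
```

The two triangular solves in `semi_normal_solve` are the step for which the paper develops parallel sparse algorithms. In this sketch they are plain forward and back substitution.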

Similar resources

Parallel solution of large-scale free surface viscoelastic flows via sparse approximate inverse preconditioning

Though computational techniques for two-dimensional viscoelastic free surface flows are well developed, three-dimensional flows continue to present significant computational challenges. Fully coupled free surface flow models lead to nonlinear systems whose steady states can be found via Newton’s method. Each Newton iteration requires the solution of a large, sparse linear system, for which memo...

Solving Irregular Sparse Linear Systems on a Multicomputer Using the CGNR Method

The efficient solution of irregular sparse linear systems on a distributed memory parallel computer is still a major challenge. Direct methods suffer from unbalanced load or data distribution, as well as difficulties in reusing efficient sequential codes. Iterative methods of the Krylov family are well suited for parallel computing but can provide disappointing convergence f...

A Parallel Interior-point Algorithm for Linear Programming on a Shared Memory Machine

The XPRESS interior point optimizer is an "industrial strength" code for the solution of large-scale sparse linear programs. The purpose of the present paper is to discuss how the XPRESS interior point optimizer has been parallelized for a Silicon Graphics multiprocessor computer. The major computational task, performed in each iteration of the interior-point method implemented in the XPRESS int...

Isoefficiency Analysis of CGLS Algorithms for Parallel Least Squares Problems

In this paper we study the parallelization of CGLS, a basic iterative method for large and sparse least squares problems whose main idea is to apply the conjugate gradient method to the normal equations. A performance model based on the isoefficiency concept is used to analyze the behavior of this method implemented on massively parallel distributed memory computers with two dimensional m...

Publication date: 2007